Adversarial Search




Single-Agent Trees 
Value of a State

Value of a state: the best achievable outcome (utility) from that state
Non-Terminal States:

$V(s) = \max_{s' \in \text{children}(s)} V(s')$



Terminal States:

$V(s) = \text{known}$



Adversarial Game Trees 
Minimax Values


States Under Agent's Control:

$V(s) = \max_{s' \in \text{successors}(s)} V(s')$



States Under Opponent's Control:

$V(s') = \min_{s \in \text{successors}(s')} V(s)$

Terminal States:

$V(s) = \text{known}$









Tic-Tac-Toe Game Tree 
Adversarial Search (Minimax)

Deterministic, zero-sum games: 

Tic-tac-toe, chess, checkers 
One player maximizes result 
The other minimizes result 

► Minimax search: 

► A state-space search tree 

Players alternate turns 

Compute each node’s minimax value: 
the best achievable utility against a 
rational (optimal) adversary 


Minimax values:
computed recursively

Terminal values:
part of the game






Minimax Implementation 


def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

$V(s) = \max_{s' \in \text{successors}(s)} V(s')$
def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v

$V(s') = \min_{s \in \text{successors}(s')} V(s)$
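
A runnable Python sketch of the two mutually recursive functions above. The tree encoding (numbers are terminal utilities, lists are non-terminal states) and the `is_terminal` helper are assumptions for the example; the pseudocode on the slide leaves the terminal check implicit, so it is added explicitly here.

```python
import math

def is_terminal(node):
    # Hypothetical encoding: a bare number is a terminal state's utility.
    return isinstance(node, (int, float))

def max_value(node):
    """Minimax value of a node where MAX chooses the move."""
    if is_terminal(node):
        return node
    v = -math.inf
    for successor in node:
        v = max(v, min_value(successor))
    return v

def min_value(node):
    """Minimax value of a node where MIN chooses the move."""
    if is_terminal(node):
        return node
    v = math.inf
    for successor in node:
        v = min(v, max_value(successor))
    return v

# MAX at the root, three MIN nodes below it, terminal utilities at the leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree))  # 3
```

The three MIN nodes evaluate to 3, 2, and 2, so MAX picks the first branch.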


Minimax Implementation (Dispatch) 


def value(state):
    if the state is a terminal state: return the state's utility
    if the next agent is MAX: return max-value(state)
    if the next agent is MIN: return min-value(state)
def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v


def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
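
The dispatch version can be sketched in Python as follows. The state encoding is an assumption for the example: a terminal state is a number, and a non-terminal state is a pair `(agent, successors)` with `agent` equal to `"MAX"` or `"MIN"`. Because `value` decides which player moves, this form does not require strict alternation of turns.

```python
import math

def value(state):
    """Dispatch on the kind of state: terminal, MAX to move, or MIN to move."""
    if isinstance(state, (int, float)):   # terminal state: return its utility
        return state
    agent, _ = state
    if agent == "MAX":
        return max_value(state)
    if agent == "MIN":
        return min_value(state)
    raise ValueError(f"unknown agent: {agent!r}")

def max_value(state):
    v = -math.inf
    for successor in state[1]:
        v = max(v, value(successor))
    return v

def min_value(state):
    v = math.inf
    for successor in state[1]:
        v = min(v, value(successor))
    return v

game = ("MAX", [("MIN", [3, 12, 8]), ("MIN", [2, 4, 6]), ("MIN", [14, 5, 2])])
print(value(game))  # 3
```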


Minimax Example 












Minimax Efficiency 

► How efficient is minimax? 

► Just like (exhaustive) DFS 

► Time: $O(b^m)$

► Space: $O(bm)$
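
To make the time bound concrete, here is a small sketch that counts the nodes an exhaustive depth-first minimax visits in a uniform tree with branching factor b and depth m. The function name `count_nodes` and the uniform-tree assumption are illustrative only. The count is dominated by the b^m leaves, while the recursion never holds more than m + 1 frames at once, each stepping through at most b successors.

```python
def count_nodes(b, m):
    """Nodes visited by a full depth-first search of a uniform tree
    with branching factor b and depth m (root at depth 0)."""
    if m == 0:
        return 1                              # a single leaf
    return 1 + b * count_nodes(b, m - 1)      # this node plus b subtrees

for m in (2, 4, 6):
    print(m, count_nodes(3, m), 3 ** m)       # total nodes vs. number of leaves
```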



Minimax Properties 
Optimal against a perfect player. Otherwise?



[Demo: min vs exp (L6D2, L6D3)] 







Evaluation Functions 





Evaluation Functions 

Evaluation functions score non-terminals in depth-limited search 
Black to move: White slightly better
White to move: Black winning

► Ideal function: returns the actual minimax value of the position
► In practice: typically weighted linear sum of features:

► $\text{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s)$
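
A minimal sketch of a weighted linear evaluation function. Everything concrete here is a hypothetical choice for illustration: the dictionary state, the two material-difference features, and the weights 9 and 1 are not taken from the slides.

```python
def linear_eval(state, weights, features):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical toy state: piece counts for each side.
state = {"white_queens": 1, "black_queens": 0, "white_pawns": 5, "black_pawns": 6}

# Hypothetical features: material differences from White's point of view.
features = [
    lambda s: s["white_queens"] - s["black_queens"],
    lambda s: s["white_pawns"] - s["black_pawns"],
]
weights = [9.0, 1.0]  # illustrative weights only

print(linear_eval(state, weights, features))  # 8.0
```

In depth-limited search, a function like this would be called in place of the true utility whenever the depth limit is reached at a non-terminal state.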






Game Tree Pruning 





Minimax Example 












Minimax Pruning 










Alpha-Beta Pruning 

General configuration (MIN version) 

We’re computing the MIN-VALUE at some node n 

We’re looping over n’s children 

n’s estimate of the children’s min is dropping

Who cares about n’s value? MAX

Let α be the best value that MAX can get at any
choice point along the current path from the root

If n becomes worse than α, MAX will avoid it, so we
can stop considering n’s other children (it’s already
bad enough that it won’t be played)


[Figure: alternating MAX and MIN levels along the path from the root to n]

MAX version is symmetric


Alpha-Beta Implementation 


α: MAX’s best option on path to root

β: MIN’s best option on path to root



def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v > β return v
        α = max(α, v)
    return v


def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v < α return v
        β = min(β, v)
    return v
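
A runnable Python sketch of the alpha-beta pseudocode above, folded into one function with a `maximizing` flag. The tree encoding (numbers are terminal utilities, lists are non-terminal states with strictly alternating turns) and the optional `visited` list, which records the leaves actually evaluated, are assumptions added for the example.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax value of `node` with alpha-beta pruning."""
    if isinstance(node, (int, float)):        # terminal state
        if visited is not None:
            visited.append(node)              # record which leaves get evaluated
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v > beta:                      # MIN above would never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v < alpha:                         # MAX above already has something better
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves = []
print(alphabeta(tree, visited=leaves))  # 3
print(leaves)                           # [3, 12, 8, 2, 14, 5, 2]
```

On this tree the leaves 4 and 6 are never evaluated: once the second MIN node sees 2, it is already worse for MAX than the 3 available from the first branch.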


Alpha-Beta Pruning Properties


This pruning has no effect on minimax value computed for the root! 

Values of intermediate nodes might be wrong 

Important: children of the root may have the wrong value 
So the most naive version won’t let you do action selection 

Good child ordering improves effectiveness of pruning 

With “perfect ordering”: 

Time complexity drops to $O(b^{m/2})$

Doubles solvable depth! 

Full search of, e.g. chess, is still hopeless... 



This is a simple example of metareasoning (computing about what to compute) 
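
The effect of child ordering on pruning can be seen on a small example. The sketch below is an illustration under assumed names (`alphabeta`, `search`) and an assumed tree encoding (numbers are terminal utilities, lists are non-terminal states, MAX moves at the root). It runs the same alpha-beta search on two orderings of the same game and counts how many leaves each one evaluates.

```python
import math

def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha-beta search that counts evaluated leaves in counter[0]."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        child_v = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            v = max(v, child_v)
            if v > beta:
                break
            alpha = max(alpha, v)
        else:
            v = min(v, child_v)
            if v < alpha:
                break
            beta = min(beta, v)
    return v

def search(tree):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    root_value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return root_value, counter[0]

# Same game, different child orderings.
good = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # best branch first, refuting moves first
bad = [[2, 4, 6], [14, 5, 2], [3, 12, 8]]    # best branch last

print(search(good))  # (3, 5)
print(search(bad))   # (3, 9)
```

Both orderings return the same root value, but the good ordering evaluates 5 of the 9 leaves while the bad ordering evaluates all of them.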





Thanks 


